Search CORE

105 research outputs found

RNA Accessibility in cubic time

Author: A Bompfünewerer
A Busch
A Serganov
DH Turner
H Tafer
H Tafer
IL Hofacker
Ivo L Hofacker
J Hackermüller
JS McCaskill
M Kertesz
M Kertesz
P Wikström
Stephan H Bernhart
U Mückstein
U Mückstein
Ullrike Mückstein
Y Ding
Y Ding
Publication venue: BioMed Central
Publication date: 01/03/2011
Field of study

Abstract Background The accessibility of RNA binding motifs controls the efficacy of many biological processes. Examples are the binding of miRNA, siRNA or bacterial sRNA to their respective targets. Similarly, the accessibility of the Shine-Dalgarno sequence is essential for translation to start in prokaryotes. Furthermore, many classes of RNA binding proteins require the binding site to be single-stranded. Results We introduce a way to compute the accessibility of all intervals within an RNA sequence in <inline-formula><graphic file="1748-7188-6-3-i1.gif"/></inline-formula>(<it>n</it>3) time. This improves on previous implementations where only intervals of one defined length were computed in the same time. While the algorithm is in the same efficiency class as sampling approaches, the results, especially if the probabilities get small, are much more exact. Conclusions Our algorithm significantly speeds up methods for the prediction of RNA-RNA interactions and other applications that require the accessibility of RNA molecules. The algorithm is already available in the program RNAplfold of the ViennaRNA package.</p

Crossref

Directory of Open Access Journals

PubMed Central

Permanent Hosting, Archiving and Indexing of Digital Resources and Assets

Polynomial algorithms for the Maximal Pairing Problem: efficient phylogenetic targeting on arbitrary trees

Author: A Purvis
C Arnold
C Arnold
C Arnold
Christian Arnold
CJ Vinyard
CL Nunn
DD Ackerly
E Rothenberg
H Gabow
HN Gabow
HN Gabow
J Felsenstein
JG Burleigh
JS McCaskill
MJ Sanderson
NB Goodwin
NLR Poff
OR Bininda-Emonds
P Steffen
Peter F Stadler
U Mückstein
WP Maddison
Z Galil
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Background: The Maximal Pairing Problem (MPP) is the prototype of a class of combinatorial optimization problems that are of considerable interest in bioinformatics: Given an arbitrary phylogenetic tree T and weights ωxy for the paths between any two pairs of leaves (x, y), what is the collection of edge-disjoint paths between pairs of leaves that maximizes the total weight? Special cases of the MPP for binary trees and equal weights have been described previously; algorithms to solve the general MPP are still missing, however. Results: We describe a relatively simple dynamic programming algorithm for the special case of binary trees. We then show that the general case of multifurcating trees can be treated by interleaving solutions to certain auxiliary Maximum Weighted Matching problems with an extension of this dynamic programming approach, resulting in an overall polynomial-time solution of complexity (n^4 log n) w.r.t. the number n of leaves. The source code of a C implementation can be obtained under the GNU Public License from http://www.bioinf.uni-leipzig.de/Software/Targeting. For binary trees, we furthermore discuss several constrained variants of the MPP as well as a partition function approach to the probabilistic version of the MPP. Conclusions: The algorithms introduced here make it possible to solve the MPP also for large trees with high-degree vertices. This has practical relevance in the field of comparative phylogenetics and, for example, in the context of phylogenetic targeting, i.e., data collection with resource limitations.Human Evolutionary Biolog

CiteSeerX

Crossref

Harvard University - DASH

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Permanent Hosting, Archiving and Indexing of Digital Resources and Assets

RNA secondary structure prediction from multi-aligned sequences

It has been well accepted that the RNA secondary structures of most functional non-coding RNAs (ncRNAs) are closely related to their functions and are conserved during evolution. Hence, prediction of conserved secondary structures from evolutionarily related sequences is one important task in RNA bioinformatics; the methods are useful not only to further functional analyses of ncRNAs but also to improve the accuracy of secondary structure predictions and to find novel functional RNAs from the genome. In this review, I focus on common secondary structure prediction from a given aligned RNA sequence, in which one secondary structure whose length is equal to that of the input alignment is predicted. I systematically review and classify existing tools and algorithms for the problem, by utilizing the information employed in the tools and by adopting a unified viewpoint based on maximum expected gain (MEG) estimators. I believe that this classification will allow a deeper understanding of each tool and provide users with useful information for selecting tools for common secondary structure predictions.Comment: A preprint of an invited review manuscript that will be published in a chapter of the book `Methods in Molecular Biology'. Note that this version of the manuscript may differ from the published versio

arXiv.org e-Print Archive

CiteSeerX

Crossref

An analysis of simple computational strategies to facilitate the design of functional molecular information processors

Author: DG Bausch
DH Mathews
DH Mathews
DR Patrick
Effirul I. Ramlan
EI Ramlan
FJ Isaacs
GA Prody
GA Soukup
HS Ong
J Macdonald
JE Poje
JS McCaskill
K Kruger
L Qian
LA Allison
M Qiu
MN Stojanovic
MN Stojanovic
MN Win
Mohd Firdaus-Raih
NE Moyer
R Lorenz
R Penchovsky
Rozieffa Roslan
S Wuchty
Shariza Azizan
SK Gire
Yiling Lee
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

BACKGROUND: Biological macromolecules (DNA, RNA and proteins) are capable of processing physical or chemical inputs to generate outputs that parallel conventional Boolean logical operators. However, the design of functional modules that will enable these macromolecules to operate as synthetic molecular computing devices is challenging. RESULTS: Using three simple heuristics, we designed RNA sensors that can mimic the function of a seven-segment display (SSD). Ten independent and orthogonal sensors representing the numerals 0 to 9 are designed and constructed. Each sensor has its own unique oligonucleotide binding site region that is activated uniquely by a specific input. Each operator was subjected to a stringent in silico filtering. Random sensors were selected and functionally validated via ribozyme self cleavage assays that were visualized via electrophoresis. CONCLUSIONS: By utilising simple permutation and randomisation in the sequence design phase, we have developed functional RNA sensors thus demonstrating that even the simplest of computational methods can greatly aid the design phase for constructing functional molecular devices. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1297-x) contains supplementary material, which is available to authorized users

Crossref

Springer - Publisher Connector

PubMed Central

Ulster University's Research Portal

FigShare

Multilevel Selection in Models of Prebiotic Evolution II: A Direct Comparison of Compartmentalization and Spatial Self-Organization

Multilevel selection has been indicated as an essential factor for the evolution of complexity in interacting RNA-like replicator systems. There are two types of multilevel selection mechanisms: implicit and explicit. For implicit multilevel selection, spatial self-organization of replicator populations has been suggested, which leads to higher level selection among emergent mesoscopic spatial patterns (traveling waves). For explicit multilevel selection, compartmentalization of replicators by vesicles has been suggested, which leads to higher level evolutionary dynamics among explicitly imposed mesoscopic entities (protocells). Historically, these mechanisms have been given separate consideration for the interests on its own. Here, we make a direct comparison between spatial self-organization and compartmentalization in simulated RNA-like replicator systems. Firstly, we show that both mechanisms achieve the macroscopic stability of a replicator system through the evolutionary dynamics on mesoscopic entities that counteract that of microscopic entities. Secondly, we show that a striking difference exists between the two mechanisms regarding their possible influence on the long-term evolutionary dynamics, which happens under an emergent trade-off situation arising from the multilevel selection. The difference is explained in terms of the difference in the stability between self-organized mesoscopic entities and externally imposed mesoscopic entities. Thirdly, we show that a sharp transition happens in the long-term evolutionary dynamics of the compartmentalized system as a function of replicator mutation rate. Fourthly, the results imply that spatial self-organization can allow the evolution of stable folding in parasitic replicators without any specific functionality in the folding itself. Finally, the results are discussed in relation to the experimental synthesis of chemical Darwinian systems and to the multilevel selection theory of evolutionary biology in general. To conclude, novel evolutionary directions can emerge through interactions between the evolutionary dynamics on multiple levels of organization. Different multilevel selection mechanisms can produce a difference in the long-term evolutionary trend of identical microscopic entities

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Directed acyclic graph kernels for structural RNA analysis

Author: B Knudsen
B Schölkopf
CB Do
D Haussler
D Sankoff
DB Searls
DM Tax
E Rivas
EK Freyhult
H Kiryu
H Saigo
I Holmes
IL Hofacker
IL Hofacker
J Hertel
J Hertel
JD Thompson
JS McCaskill
JS Pedersen
JW Brown
K Sato
Kengo Sato
Kiyoshi Asai
MA Rosenblad
P Pacheco
RD Dowell
RE Fan
RJ Klein
S Washietl
S Washietl
S Will
SR Eddy
SR Eddy
SR Eddy
T Babak
T Kin
Toutai Mituyama
W Deng
Y Sakakibara
Y Sakakibara
Y Sakakibara
Yasubumi Sakakibara
Publication venue: BioMed Central
Publication date: 01/01/2008
Field of study

Abstract Background Recent discoveries of a large variety of important roles for non-coding RNAs (ncRNAs) have been reported by numerous researchers. In order to analyze ncRNAs by kernel methods including support vector machines, we propose stem kernels as an extension of string kernels for measuring the similarities between two RNA sequences from the viewpoint of secondary structures. However, applying stem kernels directly to large data sets of ncRNAs is impractical due to their computational complexity. Results We have developed a new technique based on directed acyclic graphs (DAGs) derived from base-pairing probability matrices of RNA sequences that significantly increases the computation speed of stem kernels. Furthermore, we propose profile-profile stem kernels for multiple alignments of RNA sequences which utilize base-pairing probability matrices for multiple alignments instead of those for individual sequences. Our kernels outperformed the existing methods with respect to the detection of known ncRNAs and kernel hierarchical clustering. Conclusion Stem kernels can be utilized as a reliable similarity measure of structural RNAs, and can be used in various kernel-based applications.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Efficient Algorithms for Probing the RNA Mutation Landscape

Author: A Coventry
A Omer
A Serganov
AO Harmanci
B Baker
B Knudsen
Bonnie Berger
C Reidys
C Thurner
Consortium ENCODE Project
D Barash
D Mathews
DH Mathews
E Rivas
I Hofacker
I Hofacker
I Miklos
IL Hofacker
IM Meyer
J Waldispuhl
J Waldispuhl
JS McCaskill
JS Pedersen
JS Weinger
Jérôme Waldispühl
M Yanagi
M Yang
M Zuker
M Zuker
MC Cowperthwaite
MC Cowperthwaite
MT Cheah
NM Cuceanu
P Clote
P Schuster
P Schuster
Peter Clote
PP Gardner
R Nussinov
RA Dimitrov
RD Dowell
S Brown
S Griffiths-Jones
S Griffiths-Jones
S You
SH Bernhart
Srinivas Devadas
T Kulinski
T Xia
Uwe Ohler
V Ambros
W Fontana
W Grüner
W Shu
Y Ding
Y Ding
Y Ding
Y Ponty
Publication venue: Public Library of Science
Publication date: 08/08/2008
Field of study

The diversity and importance of the role played by RNAs in the regulation and development of the cell are now well-known and well-documented. This broad range of functions is achieved through specific structures that have been (presumably) optimized through evolution. State-of-the-art methods, such as McCaskill's algorithm, use a statistical mechanics framework based on the computation of the partition function over the canonical ensemble of all possible secondary structures on a given sequence. Although secondary structure predictions from thermodynamics-based algorithms are not as accurate as methods employing comparative genomics, the former methods are the only available tools to investigate novel RNAs, such as the many RNAs of unknown function recently reported by the ENCODE consortium. In this paper, we generalize the McCaskill partition function algorithm to sum over the grand canonical ensemble of all secondary structures of all mutants of the given sequence. Specifically, our new program, RNAmutants, simultaneously computes for each integer k the minimum free energy structure MFE(k) and the partition function Z(k) over all secondary structures of all k-point mutants, even allowing the user to specify certain positions required not to mutate and certain positions required to base-pair or remain unpaired. This technically important extension allows us to study the resilience of an RNA molecule to pointwise mutations. By computing the mutation profile of a sequence, a novel graphical representation of the mutational tendency of nucleotide positions, we analyze the deleterious nature of mutating specific nucleotide positions or groups of positions. We have successfully applied RNAmutants to investigate deleterious mutations (mutations that radically modify the secondary structure) in the Hepatitis C virus cis-acting replication element and to evaluate the evolutionary pressure applied on different regions of the HIV trans-activation response element. In particular, we show qualitative agreement between published Hepatitis C and HIV experimental mutagenesis studies and our analysis of deleterious mutations using RNAmutants. Our work also predicts other deleterious mutations, which could be verified experimentally. Finally, we provide evidence that the 3′ UTR of the GB RNA virus C has been optimized to preserve evolutionarily conserved stem regions from a deleterious effect of pointwise mutations. We hope that there will be long-term potential applications of RNAmutants in de novo RNA design and drug design against RNA viruses. This work also suggests potential applications for large-scale exploration of the RNA sequence-structure network. Binary distributions are available at http://RNAmutants.csail.mit.edu/

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

AUG_hairpin: prediction of a downstream secondary structure influencing the recognition of a translation start site

Author: Akinori Sarai
Alex V Kochetov
Andrey Palyanov
AV Kochetov
AV Kochetov
AV Kochetov
AV Pisarev
C Touriol
Dmitry Grigorovich
HS Kwon
I Ventoso
IB Rogozin
Igor I Titov
IL Hofacker
IM Meyer
JL Riechmann
JS McCaskill
K Clyde
K Takahashi
K-N Zhao
L Yang
M Ciullo
M Kozak
M Kozak
M Kozak
M Lukaszewicz
M Nguyen
Nikolay A Kolchanov
RJ Jackson
SA Shabalina
SA Shabalina
SD Baird
SV Sawant
W-L Hwang
Y Kobayashi
Publication venue: BioMed Central
Publication date: 01/08/2007
Field of study

Abstract Background The translation start site plays an important role in the control of translation efficiency of eukaryotic mRNAs. The recognition of the start AUG codon by eukaryotic ribosomes is considered to depend on its nucleotide context. However, the fraction of eukaryotic mRNAs with the start codon in a suboptimal context is relatively large. It may be expected that mRNA should possess some features providing efficient translation, including the proper recognition of a translation start site. It has been experimentally shown that a downstream hairpin located in certain positions with respect to start codon can compensate in part for the suboptimal AUG context and also increases translation from non-AUG initiation codons. Prediction of such a compensatory hairpin may be useful in the evaluation of eukaryotic mRNA translation properties. Results We evaluated interdependency between the start codon context and mRNA secondary structure at the CDS beginning: it was found that a suboptimal start codon context significantly correlated with higher base pairing probabilities at positions 13 – 17 of CDS of human and mouse mRNAs. It is likely that the downstream hairpins are used to enhance translation of some mammalian mRNAs <it>in vivo</it>. Thus, we have developed a tool, <it>AUG_hairpin</it>, to predict local stem-loop structures located within the defined region at the beginning of mRNA coding part. The implemented algorithm is based on the available published experimental data on the CDS-located stem-loop structures influencing the recognition of upstream start codons. Conclusion An occurrence of a potential secondary structure downstream of start AUG codon in a suboptimal context (or downstream of a potential non-AUG start codon) may provide researchers with a testable assumption on the presence of additional regulatory signal influencing mRNA translation initiation rate and the start codon choice. <it>AUG_hairpin</it>, which has a convenient Web-interface with adjustable parameters, will make such an evaluation easy and efficient.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework

Author: A Wilm
B Knudsen
BW Matthews
C Notredame
C Notredame
C Notredame
CB Do
CB Do
D Dalli
DF Feng
DH Mathews
E Torarinsson
GJ Barton
H Kiryu
H Kiryu
H Zhou
Hiroyuki Toh
I Holmes
IL Hofacker
J Heringa
J Pei
J Reeder
JD Thompson
JD Thompson
JH Havgaard
JS McCaskill
JS Papadopoulos
K Katoh
K Katoh
Kazutaka Katoh
M Bauer
M Hochsmann
M Kruspe
M Vingron
MA Larkin
MM Yusupov
O Gotoh
O Gotoh
PP Gardner
PP Gardner
S Griffiths-Jones
S Lindgreen
S Washietl
SB Needleman
SR Eddy
TF Smith
VA Simossis
X Xu
Y Tabei
Y Tabei
YM Hou
Z Yao
Publication venue: BioMed Central
Publication date: 01/01/2008
Field of study

Abstract Background Structural alignment of RNAs is becoming important, since the discovery of functional non-coding RNAs (ncRNAs). Recent studies, mainly based on various approximations of the Sankoff algorithm, have resulted in considerable improvement in the accuracy of pairwise structural alignment. In contrast, for the cases with more than two sequences, the practical merit of structural alignment remains unclear as compared to traditional sequence-based methods, although the importance of multiple structural alignment is widely recognized. Results We took a different approach from a straightforward extension of the Sankoff algorithm to the multiple alignments from the viewpoints of accuracy and time complexity. As a new option of the MAFFT alignment program, we developed a multiple RNA alignment framework, X-INS-i, which builds a multiple alignment with an iterative method incorporating structural information through two components: (1) pairwise structural alignments by an external pairwise alignment method such as SCARNA or LaRA and (2) a new objective function, Four-way Consistency, derived from the base-pairing probability of every sub-aligned group at every multiple alignment stage. Conclusion The BRAliBASE benchmark showed that X-INS-i outperforms other methods currently available in the sum-of-pairs score (SPS) criterion. As a basis for predicting common secondary structure, the accuracy of the present method is comparable to or rather higher than those of the current leading methods such as RNA Sampler. The X-INS-i framework can be used for building a multiple RNA alignment from any combination of algorithms for pairwise RNA alignment and base-pairing probability. The source code is available at the webpage found in the Availability and requirements section.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Prediction of RNA secondary structure by maximizing pseudo-expected accuracy

Author: B Knudsen
C Do
D Mathews
H Kiryu
I Hofacker
I Holmes
IL Hofacker
JS McCaskill
K Sato
Kengo Sato
Kiyoshi Asai
L Carvalho
L Kall
M Andronescu
M Andronescu
M Hamada
M Hamada
M Hamada
M Hamada
M Parisien
M Zuker
M Zuker
MC Frith
Michiaki Hamada
N Michal
P Baldi
PP Gardner
R Durbin
RK Bradley
RK Bradley
S Bernhart
S Engelen
S Griffiths-Jones
S Gross
S Seemann
SJ Schroeder
Y Ding
Y Ding
Y Ding
ZJ Lu
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Abstract Background Recent studies have revealed the importance of considering the entire distribution of possible secondary structures in RNA secondary structure predictions; therefore, a new type of estimator is proposed including the maximum expected accuracy (MEA) estimator. The MEA-based estimators have been designed to maximize the expected accuracy of the base-pairs and have achieved the highest level of accuracy. Those methods, however, do not give the single best prediction of the structure, but employ parameters to control the trade-off between the sensitivity and the positive predictive value (PPV). It is unclear what parameter value we should use, and even the well-trained default parameter value does not, in general, give the best result in popular accuracy measures to each RNA sequence. Results Instead of using the expected values of the popular accuracy measures for RNA secondary structure prediction, which is difficult to be calculated, the <it>pseudo</it>-expected accuracy, which can easily be computed from base-pairing probabilities, is introduced. It is shown that the pseudo-expected accuracy is a good approximation in terms of sensitivity, PPV, MCC, or F-score. The pseudo-expected accuracy can be approximately maximized for each RNA sequence by stochastic sampling. It is also shown that well-balanced secondary structures between sensitivity and PPV can be predicted with a small computational overhead by combining the pseudo-expected accuracy of MCC or F-score with the γ-centroid estimator. Conclusions This study gives not only a method for predicting the secondary structure that balances between sensitivity and PPV, but also a general method for approximately maximizing the (pseudo-)expected accuracy with respect to various evaluation measures including MCC and F-score.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central